Text Analysis of the Conflict in Gaza Reveals the Civilian Impact
Background: On 10/7/2023, Hamas attacks Israel, killing 1,200 people; Israel retaliates. Estimated dead in Gaza: ~44,000.
Problem: The Hamas Ministry of Health (MoH) reports casualties, but its figures cannot be corroborated.
Claim: “Women & children disproportionately killed” (UN-OHCHR).
Research Question: To what extent can open-source data be used to identify patterns in the targeting of Palestinian civilians in Gaza?
Data Source: Airwars tracks civilian incidents from conflicts.
Web Scrape: ~800 incidents (~9,000 deaths, ~10,000 injured), stored in SQLite.
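A minimal sketch of the storage step, assuming each scraped incident is kept as raw JSON keyed by an incident ID. The table name, column names, and example URL are hypothetical, not the project's actual schema.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) incidents table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS incidents (
               incident_id TEXT PRIMARY KEY,
               url         TEXT,
               raw_json    TEXT NOT NULL
           )"""
    )


def save_incident(conn: sqlite3.Connection, incident_id: str, url: str, payload: dict) -> None:
    """Insert or overwrite one incident, so re-running the scrape does not duplicate rows."""
    conn.execute(
        "INSERT OR REPLACE INTO incidents (incident_id, url, raw_json) VALUES (?, ?, ?)",
        (incident_id, url, json.dumps(payload)),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to persist between runs
    init_db(conn)
    save_incident(conn, "X1", "https://example.org/incident/X1", {"killed": 3})
    print(conn.execute("SELECT COUNT(*) FROM incidents").fetchone()[0])
```

Keeping the raw JSON alongside the ID lets later parsing steps be re-run without scraping again.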
JSON Parse: Incident characteristics & casualties. Extract geocoordinates from “Assessment” (65% of incidents).
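A minimal sketch of the coordinate extraction, assuming the assessment text contains a decimal-degree "lat, lon" pair. The JSON keys (`id`, `killed`, `assessment`) and the sample text are hypothetical; the real records may be laid out differently.

```python
import json
import re
from typing import Optional, Tuple

# Decimal-degree pair such as "31.5017, 34.4668"; the lookbehind avoids
# starting a match in the middle of a longer number.
COORD_RE = re.compile(r"(?<![\d.])(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def extract_coords(text: str) -> Optional[Tuple[float, float]]:
    """Return the first plausible (lat, lon) pair found in the text, or None."""
    for match in COORD_RE.finditer(text):
        lat, lon = float(match.group(1)), float(match.group(2))
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            return lat, lon
    return None


def parse_incident(raw_json: str) -> dict:
    """Pull a few fields out of one incident record (key names are assumptions)."""
    record = json.loads(raw_json)
    return {
        "id": record.get("id"),
        "killed": record.get("killed"),
        "coords": extract_coords(record.get("assessment", "")),
    }
```

Incidents whose assessment has no coordinate pair come back with `coords` set to `None`, which matches the partial coverage noted above.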
Reverse Geocoding: Submit incident coordinates to the Nominatim API; keep the location type (e.g., school, hospital).
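A minimal sketch of the reverse-geocoding call, assuming the public Nominatim `/reverse` endpoint with `format=jsonv2`, whose response carries `category` and `type` fields. The User-Agent string, zoom level, and one-second pause are illustrative choices, not the project's settings.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Optional

NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"


def build_reverse_url(lat: float, lon: float, zoom: int = 18) -> str:
    """Build the reverse-geocoding URL for one coordinate pair."""
    query = urllib.parse.urlencode({"lat": lat, "lon": lon, "format": "jsonv2", "zoom": zoom})
    return f"{NOMINATIM_REVERSE}?{query}"


def location_type(payload: dict) -> Optional[str]:
    """Return the OSM 'type' (e.g. 'school', 'hospital'), or None if absent."""
    return payload.get("type")


def reverse_geocode(lat: float, lon: float) -> Optional[str]:
    """Query Nominatim for one point and return its location type."""
    request = urllib.request.Request(
        build_reverse_url(lat, lon),
        headers={"User-Agent": "example-research-script/0.1"},  # placeholder identifier
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    time.sleep(1)  # stay within the public server's one-request-per-second limit
    return location_type(payload)
```

An error response has no `type` field, so `location_type` returns `None` for points that cannot be geocoded.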
Sentiment Analysis: Derive emotional tone from assessments with DistilRoBERTa-base, which classifies text into Ekman’s 6 basic emotions.
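A minimal sketch of the classification step, assuming the Hugging Face `transformers` text-classification pipeline. The source names only "DistilRoBERTa-base", so the checkpoint is left as a caller-supplied `model_name`; both function names are hypothetical.

```python
from typing import Dict, List


def dominant_emotion(scores: List[Dict[str, object]]) -> str:
    """Pick the label with the highest score from one text's per-label scores."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify_assessments(texts: List[str], model_name: str) -> List[str]:
    """Return the dominant emotion label for each assessment text.

    model_name must be an emotion-classification checkpoint chosen by the
    caller; it is not specified here.
    """
    from transformers import pipeline  # imported lazily; requires `pip install transformers`

    # top_k=None asks for scores for every label rather than only the best one.
    classifier = pipeline("text-classification", model=model_name, top_k=None)
    results = classifier(texts, truncation=True)  # truncate long assessments to the model limit
    return [dominant_emotion(per_text) for per_text in results]
```

Keeping the full per-label scores, rather than only the top label, would also let them serve as numeric features in later analysis.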
Clustering Analysis: Combine geographic and sentiment features to unpack geographic associations with child & women casualties.
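A minimal sketch of one way to cluster incidents on combined features. The clustering algorithm is not named above, so k-means is an assumption here, as are the feature layout (latitude, longitude, one emotion score) and the sample values.

```python
import math
import random
from typing import List, Optional, Sequence


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale each column to zero mean and unit variance so no feature dominates."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0  # guard constant columns
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(rows: List[List[float]], k: int, iters: int = 50, seed: int = 0) -> List[int]:
    """Plain k-means: returns one cluster label per row."""
    rng = random.Random(seed)
    centers = [list(row) for row in rng.sample(rows, k)]  # random rows as initial centers
    labels: Optional[List[int]] = None
    for _ in range(iters):
        # Assign every row to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(row, centers[j])) for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels or []


if __name__ == "__main__":
    # Illustrative rows only: [latitude, longitude, emotion score]
    features = [[31.50, 34.45, 0.9], [31.51, 34.46, 0.8], [31.30, 34.25, 0.1], [31.31, 34.26, 0.2]]
    print(kmeans(standardize(features), k=2))
```

Cluster labels can then be cross-tabulated against child and women casualty counts per incident.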